1 Tutorial goals

The goal of this tutorial is to explore how to fit hidden Markov models (HMMs) to movement data. To do so, we will use the R package momentuHMM, which was developed by Brett McClintock and Théo Michelot. This package builds on a slightly older package, moveHMM, that was developed by Théo Michelot, Roland Langrock, and Toby Patterson, see associated paper: https://besjournals.onlinelibrary.wiley.com/doi/full/10.1111/2041-210X.12578. momentuHMM has new features such as allowing for more data streams, inclusion of covariates as raster layers, and much more, see associated paper: https://besjournals.onlinelibrary.wiley.com/doi/abs/10.1111/2041-210X.12995.

The primary learning objectives in this first tutorial are to:

  1. Fit simple HMMs using momentuHMM
  2. Check model fit
  3. Determine the number of states to use

We also include an Extras section where you can explore additional topics.

2 Setup

First, let’s load the packages that we will need to complete the analyses. Of course you need to have them installed first.

While the raster package is being phased out in favour of the newer terra package, momentuHMM still relies on raster to extract covariates. Therefore, for this tutorial we will still use raster, but we also use terra at times, as it is the most up-to-date package for rasters.

Make sure to set the working directory to the “ISEC2024_HMM_workshop” folder of the HMM workshop:

setwd("1_Morning_Tutorial/Tutorial_files")

We will need the functions in the file utility_functions.R:

source("utility_functions.R")

3 Movement data: Narwhals

We will analyze a dataset containing movement tracks of three narwhals tagged with Fastloc-GPS tags. The dataset was provided by Dr. Marianne Marcoux (Fisheries and Oceans Canada). For simplicity, we only examine the Fastloc-GPS data from two weeks in August 2017.

4 Import data (with description and simple processing)

HMMs assume that locations are taken at regular time steps and that there is negligible position error. For example, HMMs are adequate for data collected from tags that take GPS locations at a set temporal frequency (e.g., every \(2\) hrs). Without reprocessing, simple HMMs are not suitable for irregular time series of locations with large measurement error (e.g., you would need to preprocess Argos data before applying the type of HMMs presented in this tutorial).

As is the case for the data of many aquatic species, the raw narwhal Fastloc-GPS data were collected at irregular time intervals (the timing of locations is dictated partly by when the animals surface). To simplify the tutorial, we have already regularized the tracks. Specifically, we have split tracks with data gaps of more than 1 hr into a set of shorter tracks, and have interpolated the locations at a 10 min interval using a continuous-time correlated random walk fitted with the momentuHMM crawlWrap function. If you are interested in these pre-processing steps and decisions, see section 7.4 below.

Let’s import the regularized narwhal movement data and convert the time column to an appropriate date format.

tracks_gps <- read.csv("data/tracks_regular.csv") %>%
  mutate(time = ymd_hms(time))

We can have a first look at the data, to get more familiar with it.

head(tracks_gps)
##          ID                time         x        y
## 1 T172062-1 2017-08-08 00:20:00 -80.65250 72.27280
## 2 T172062-1 2017-08-08 00:30:00 -80.65958 72.27336
## 3 T172062-1 2017-08-08 00:40:00 -80.66562 72.27242
## 4 T172062-1 2017-08-08 00:50:00 -80.66879 72.26852
## 5 T172062-1 2017-08-08 01:00:00 -80.67032 72.26240
## 6 T172062-1 2017-08-08 01:10:00 -80.66759 72.25797

The coordinates are in Lat/Lon (x = lon, y = lat).

For some of the visualizations, it’s helpful to use a spatial object, so we transform the track data into an sf object and set the coordinate reference system of the original data (in this case WGS84, which is defined by the EPSG code 4326).

tracks_gps_sf <- tracks_gps %>% st_as_sf(coords = c("x", "y")) %>% st_set_crs(4326)

To get a reference of where these narwhals are swimming, let’s import a land layer.

land <- st_read("data/NorthBaffin.shp")
## Reading layer `NorthBaffin' from data source 
##   `/home/sofia/Candu_postdoc/CANSSI/CANSSI_HMM_wk_dthon_2025/Workshop/1_Morning_Tutorial/Tutorial_files/data/NorthBaffin.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 1 feature and 1 field
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -81.38055 ymin: 71.51935 xmax: -77.43866 ymax: 73.2949
## Geodetic CRS:  WGS 84

Let’s plot the narwhal movement data over the map.

ggplot() +
  geom_sf(data=land, fill="beige") +
  geom_spatial_path(data = tracks_gps, 
                    aes(x = x, y = y, colour=factor(ID)), crs = 4326) + 
  theme_classic() +
  theme(panel.background = element_rect(fill = 'lightblue'))

We can see that even though this is the data for only 3 narwhals, there are more than 3 tracks. These are the split tracks we created every time we had a data gap longer than 1 hr.

5 Create data streams

Steps and turns are a way to describe observed animal trajectories by measuring the distance between locations and the angle of the turns made between them.

Step length: distance between two consecutive locations

Turning angle: the angle between three consecutive locations.

We will use these measurements as observations to fit a HMM.

To derive our data streams from our observations, we will use the prepData function. This process involves transforming our data into steps and turning angles, making it ready to use with momentuHMM.

The default coordinate type is UTM, but here the narwhal location data are in latitude and longitude, so we specify “LL” in the argument type.
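Before calling it, it may help to see what prepData computes. Below is a minimal base-R sketch (a toy illustration, not momentuHMM's implementation: steps_and_turns is a hypothetical helper, and it uses planar x/y distances rather than the great-circle calculations applied to lat/lon data):

```r
# Toy sketch of step lengths and turning angles from planar x/y coordinates.
# prepData() does this properly, including great-circle distances for "LL" data.
steps_and_turns <- function(x, y) {
  dx <- diff(x)
  dy <- diff(y)
  step <- sqrt(dx^2 + dy^2)        # distance between consecutive locations
  bearing <- atan2(dy, dx)         # direction of each step
  angle <- c(NA, diff(bearing))    # change of direction between steps
  # wrap turning angles into (-pi, pi]
  angle <- ifelse(angle > pi, angle - 2 * pi,
                  ifelse(angle <= -pi, angle + 2 * pi, angle))
  data.frame(step = step, angle = angle)
}

# one step east, then a 90-degree left turn to the north
steps_and_turns(x = c(0, 1, 1), y = c(0, 0, 1))
```

The first turning angle is NA because a turn needs three consecutive locations, which matches the NA in the first row of the prepData output below.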

# calculate step length and turning angles
data <- prepData(tracks_gps, type = "LL")

# take a peek at the data
head(data)
##          ID      step     angle                time         x        y
## 1 T172062-1 0.2489214        NA 2017-08-08 00:20:00 -80.65250 72.27280
## 2 T172062-1 0.2308335 0.7301706 2017-08-08 00:30:00 -80.65958 72.27336
## 3 T172062-1 0.4477989 0.8532449 2017-08-08 00:40:00 -80.66562 72.27242
## 4 T172062-1 0.6847686 0.1671223 2017-08-08 00:50:00 -80.66879 72.26852
## 5 T172062-1 0.5030554 0.2611253 2017-08-08 01:00:00 -80.67032 72.26240
## 6 T172062-1 0.2394635 1.0924722 2017-08-08 01:10:00 -80.66759 72.25797

We can see that prepData has calculated the step length and the turning angle at each time step.

6 Fit HMM to data

It is now possible to fit a HMM to this dataset, which we will explore next using the R package momentuHMM.

The primary learning objectives of this part are:

  1. Fit simple HMMs using momentuHMM
  2. Check model fit
  3. Determine the number of states to use

6.1 Fitting the model

Our data is ready to use, and we are almost ready to fit the HMM (via the function fitHMM). We now need to make several important modelling decisions, and most of these decisions are associated with one of the arguments of the function. The minimum arguments fitHMM requires are: fitHMM(data, nbStates, dist, Par0).

  1. When we fit a HMM, we need to decide the number of behavioural states we are going to model. To start simple, we will only use two behavioural states. This means that the argument nbStates will be set to \(2\).

  2. We need to select the distribution we will use to model the step lengths and the one we will use to model the turning angles. For now, we will use the gamma distribution for the step lengths and the von Mises for the turning angles. These are commonly used distributions for modelling movement data. To specify these choices we will set the argument dist to: list(step = "gamma", angle = "vm"). Note that dist should be a list with an item for each data stream column in the data that we are modelling (here, the columns step and angle). The gamma distribution is strictly positive (i.e., it does not allow for \(0\)s). If you have step lengths that are exactly zero in your data set, you need to use a zero-inflated distribution. But in this case, we have no zeros.

  3. We need to decide whether we make the transition probabilities between the behavioural states dependent on covariates. Here, we will start simple and we will not use covariates.

  4. We need to select initial values for the parameters to estimate. HMMs are fitted using maximum likelihood estimation (MLE) with a numerical optimizer. An unfortunate aspect of fitting models with numerical optimizers is that, to explore the parameter space and find the best parameter values for the model, the optimizer needs to start somewhere, and you need to decide where it starts. Choosing bad starting values can result in parameter estimates that are not the MLE, but just a local maximum. To choose your initial parameters, you can take a quick peek at the data (e.g., using the plots below) and use some general biological information. For example, it’s common for animals to have one behaviour with long step lengths and small turning angles and one behaviour with short step lengths and larger turning angles.

Note that, by default, fitHMM sets the argument estAngleMean to NULL, which means that we assume that the mean angle is \(0\) for both behaviours (i.e., the animal has a tendency to continue in the same direction) and that only the angle concentration parameters differ. The concentration parameters control the extent to which the animal continues forward versus turns. This assumption reduces the number of parameters to estimate. These are all very important decisions that you must make when you construct your model. It’s useful to visualise your data in order to choose initial parameters.

par(mfrow= c(1, 2))
hist(data$step, 25, main = "", xlab = "step length (km)")
hist(data$angle, 25, main = "", xlab = "turning angle")

plot(data$step ~ data$time, ty = "l", ylab = "Step length",
     xlab = "Date", las = 1)
abline(h = 0.1, col = rgb(1, 0, 0, 0.5))
abline(h = 0.6, col = rgb(1, 0, 0, 0.5))

Based on visual examination of the data, we might choose our initial parameters to be \(0.1\) and \(0.6\) for the mean step lengths and use half those values for the standard deviations. The turning angles are either very close to \(0\) or spread widely from \(-\pi/2\) to \(\pi/2\). High concentration parameter values (\(\kappa\), pronounced “kappa”) mean that the animal has a tendency to move in the same direction, while values close to 0 mean that the turning angle distribution is almost uniform (the animal turns in all directions). Note that \(\kappa\) cannot be exactly 0, so let’s choose 0.1 and 1.
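To see what different values of \(\kappa\) imply, we can sketch the von Mises density in base R (dvm is a hypothetical helper written for illustration; momentuHMM uses its own "vm" distribution internally):

```r
# von Mises density: exp(kappa * cos(theta - mu)) / (2 * pi * I0(kappa)),
# where I0 is the modified Bessel function of order 0 (besselI in base R)
dvm <- function(theta, mu = 0, kappa = 1) {
  exp(kappa * cos(theta - mu)) / (2 * pi * besselI(kappa, nu = 0))
}

theta <- seq(-pi, pi, length.out = 200)
# kappa = 5: strongly directed movement; kappa = 0.1: nearly uniform turning
plot(theta, dvm(theta, kappa = 5), type = "l",
     xlab = "turning angle", ylab = "density")
lines(theta, dvm(theta, kappa = 0.1), lty = 2)
```

The dashed (kappa = 0.1) curve is nearly flat, while the solid (kappa = 5) curve piles density near 0, i.e. straight-ahead movement.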

So now, let’s fit the model:

  • 2 states
  • Observations: step length (gamma distribution) and turning angle (von Mises distribution)
  • No covariates included

# define number of states
nbState <- 2

# define distribution to use for each data observation
dist <- list(step = "gamma", angle = "vm")

# Setting up the starting values
mu0 <- c(0.1, 0.6) # Mean step length
sigma0 <- mu0/2 # SD of the step length
kappa0 <- c(0.1, 1) # Turn angle concentration parameter

# combine starting parameters 
Par0 <- list(step = c(mu0, sigma0), angle = kappa0)

Ok, we are ready. Let’s fit the HMM and look at the parameter estimates.

# Fit a 2 state HMM
mod <- fitHMM(data, 
              nbState = nbState, 
              dist = dist, 
              Par0 = Par0)

# Let's look at parameter estimates
mod
## Value of the maximum log-likelihood: -1139.461 
## 
## 
## step parameters:
## ----------------
##        state 1   state 2
## mean 0.2771543 0.7494254
## sd   0.1479317 0.2487530
## 
## angle parameters:
## -----------------
##                state 1  state 2
## mean          0.000000  0.00000
## concentration 2.259034 17.58048
## 
## Regression coeffs for the transition probabilities:
## ---------------------------------------------------
##                1 -> 2    2 -> 1
## (Intercept) -2.587876 -2.551128
## 
## Transition probability matrix:
## ------------------------------
##            state 1    state 2
## state 1 0.93007723 0.06992277
## state 2 0.07235073 0.92764927
## 
## Initial distribution:
## ---------------------
##   state 1   state 2 
## 0.4602906 0.5397094

We can also use the plot function to plot the outputs. The most interesting output is usually the state-dependent distributions (i.e., the estimated distributions of step lengths and turning angles in each state). Note that setting plotTracks = TRUE will plot maps of each track segment, colored based on the estimated state. We will set it to FALSE and show a more integrative way to plot these below.

# plot state-dependent distributions
plot(mod, ask = FALSE, plotTracks = FALSE)
## Decoding state sequence... DONE

Based on the mean step length parameters, it looks like the first behavioural state has smaller step lengths compared to state 2. The turning angle distribution of the second state indicates much more directed movement, with a higher angular concentration.

Based on this, we may label state 1 as “resident” and state 2 as “travel”.

6.2 Identifying behavioural states

We are interested in identifying when an animal is in each of the behavioural states (i.e., when the animal is in state 1 vs state 2), which is called state decoding. We can use the function viterbi, which applies the Viterbi algorithm to produce the most likely sequence of states given your fitted model and data.

# Apply the Viterbi algorithm using your fitted model object
dec_states <- viterbi(mod)

# Let's look at predicted states of the first 20 time steps
head(dec_states, 20)
##  [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2
# How many locations in each state do we have?
table(dec_states)
## dec_states
##    1    2 
## 1217 1217

This kind of information (the number of locations in each state) can be used to compute a behavioural budget, for example. Here it looks like the narwhals spend 50% of their time in each behavioural state.
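For intuition, the Viterbi algorithm used above is a short dynamic program. Below is a minimal, generic base-R sketch (viterbi_sketch is a hypothetical helper for a toy example, not momentuHMM's viterbi code):

```r
# Viterbi sketch on the log scale for numerical stability.
# delta: initial distribution; Gamma: transition probability matrix;
# logdens: T x K matrix of log state-dependent densities of the observations.
viterbi_sketch <- function(delta, Gamma, logdens) {
  Tn <- nrow(logdens); K <- ncol(logdens)
  v <- matrix(-Inf, Tn, K)         # best log-probability ending in each state
  back <- matrix(0L, Tn, K)        # backpointers
  v[1, ] <- log(delta) + logdens[1, ]
  for (t in 2:Tn) {
    for (j in 1:K) {
      cand <- v[t - 1, ] + log(Gamma[, j])
      back[t, j] <- which.max(cand)
      v[t, j] <- max(cand) + logdens[t, j]
    }
  }
  states <- integer(Tn)
  states[Tn] <- which.max(v[Tn, ])
  for (t in (Tn - 1):1) states[t] <- back[t + 1, states[t + 1]]
  states
}

# toy example: 3 observations clearly favouring state 1, then 3 favouring state 2
Gamma <- matrix(c(0.9, 0.1, 0.1, 0.9), 2, 2, byrow = TRUE)
logdens <- rbind(c(0, -10), c(0, -10), c(0, -10),
                 c(-10, 0), c(-10, 0), c(-10, 0))
viterbi_sketch(delta = c(0.5, 0.5), Gamma = Gamma, logdens = logdens)
```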

In many cases, it is more interesting to get the probability of being in each state rather than the most likely state. To do so, you can use the function stateProbs, which returns a matrix of the probability of being in each state for each time step (this is known as local decoding).

# Calculate the probability of being in each state
statep <- stateProbs(mod)
# Let's look at the state probability matrix
head(statep)
##        state 1      state 2
## [1,] 0.9971360 2.864001e-03
## [2,] 0.9999777 2.227324e-05
## [3,] 0.9939176 6.082427e-03
## [4,] 0.5892963 4.107037e-01
## [5,] 0.6732949 3.267051e-01
## [6,] 0.9999935 6.547426e-06

We can see here the probability of being in both states for the first \(6\) time steps. Here, the probability of being in state 1 is really high for the first three steps, but that may not always be the case. Sometimes (for example steps 4 and 5) you might have values closer to \(0.5\), which would indicate that for that time step, it’s hard to identify which state you are in (i.e., which step length and turning angle distributions fit best).
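Local decoding comes from the forward-backward algorithm. Here is a compact base-R sketch with scaling to avoid numerical underflow (state_probs_sketch is a hypothetical helper illustrating the idea, not momentuHMM's stateProbs code):

```r
# Forward-backward sketch: returns a T x K matrix of Pr(S_t = k | all data).
# delta: initial distribution; Gamma: transition matrix;
# dens: T x K matrix of state-dependent densities (not logged).
state_probs_sketch <- function(delta, Gamma, dens) {
  Tn <- nrow(dens); K <- ncol(dens)
  alpha <- matrix(0, Tn, K); beta <- matrix(0, Tn, K)
  scales <- numeric(Tn)
  # forward pass, normalising at each step
  a <- delta * dens[1, ]
  scales[1] <- sum(a); alpha[1, ] <- a / scales[1]
  for (t in 2:Tn) {
    a <- (alpha[t - 1, ] %*% Gamma) * dens[t, ]
    scales[t] <- sum(a); alpha[t, ] <- a / scales[t]
  }
  # backward pass, reusing the same scaling constants
  beta[Tn, ] <- 1
  for (t in (Tn - 1):1)
    beta[t, ] <- (Gamma %*% (dens[t + 1, ] * beta[t + 1, ])) / scales[t + 1]
  probs <- alpha * beta
  probs / rowSums(probs)
}

# toy example: observations strongly favouring state 1, then state 2
Gamma <- matrix(c(0.9, 0.1, 0.1, 0.9), 2, 2, byrow = TRUE)
dens <- rbind(c(1, 1e-6), c(1, 1e-6), c(1e-6, 1))
state_probs_sketch(delta = c(0.5, 0.5), Gamma = Gamma, dens = dens)
```

Each row sums to 1, so every time step gets a full probability distribution over the states rather than a single hard label.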

You can plot the entire time series (for each ID) from both of these functions using the function plotStates. It’s often more interesting to plot the track colored by the most likely states. Here, we plot the locations colored by each form of decoding: the viterbi sequence and probability of being in state 1.

# add column for the viterbi sequence and state probabilities
data$viterbi_state <- factor(dec_states)
data$state_p1 <- statep[,1]

# Plot tracks, coloured by viterbi states
ggplot(data, aes(x, y, col = viterbi_state, group = ID)) +
  geom_point(size = 0.5) + geom_path() +
  scale_color_manual(values = c("#E69F00", "#56B4E9"))

# Plot tracks, coloured by state probabilities
ggplot(data, aes(x, y, group = ID, col = state_p1)) +
  geom_point(size = 0.5) + geom_path() + 
  scale_color_continuous(high = "#E69F00", low = "#56B4E9")

You can see that there is general agreement between the global (Viterbi) and local decoding, indicating that the state probabilities are generally high and we can have high confidence in our state classification.

6.3 Checking the stability of the parameter estimates

As mentioned above, the initial parameter values can affect the estimating procedures. Thus, it’s important to assess whether the parameter estimates are sensitive to the starting values we used (i.e., checking that we are not clearly at a local maximum). To do so, we check if we get similar results with different starting values. Here, we will make the starting values of the two behavioural states closer to one another.

# Lets now name the states 
stateNames <- c("resident", "travel")

# Setting up the starting values
mu2 <- c(0.400, 0.600) # Mean step length
sigma2 <- mu2/2 # SD of the step length
kappa2 <- c(1, 1) # Turn angle concentration parameter

# combine starting parameters 
Par0_2 <- list(step = c(mu2, sigma2), angle = kappa2)

# Fit the same 2 state HMM
mod2 <- fitHMM(data, 
               stateNames = stateNames, 
               nbState = 2, 
               dist = dist, 
               Par0 = Par0_2)

Let’s compare the two results. First, let’s look at which of the two has the lowest negative log-likelihood (equivalent to the highest log-likelihood, so closest to the true MLE). Let’s also look at the parameter estimates they each returned.

# Negative log likelihood
c(original = mod$mod$minimum, new = mod2$mod$minimum)
## original      new 
## 1139.461 1139.461
mod$mod$minimum - mod2$mod$minimum
## [1] 8.515144e-10
# Parameter estimates
list(original = mod$mle$step, new = mod2$mle$step)
## $original
##        state 1   state 2
## mean 0.2771543 0.7494254
## sd   0.1479317 0.2487530
## 
## $new
##       resident    travel
## mean 0.2771543 0.7494254
## sd   0.1479317 0.2487531
list(original = mod$mle$angle, new = mod2$mle$angle)
## $original
##                state 1  state 2
## mean          0.000000  0.00000
## concentration 2.259034 17.58048
## 
## $new
##               resident   travel
## mean          0.000000  0.00000
## concentration 2.259033 17.58048
list(original = mod$mle$gamma, new = mod2$mle$gamma)
## $original
##            state 1    state 2
## state 1 0.93007723 0.06992277
## state 2 0.07235073 0.92764927
## 
## $new
##            resident     travel
## resident 0.93007715 0.06992285
## travel   0.07235075 0.92764925

Looks like they both returned very similar estimates for everything, though the negative log-likelihood from the second set of starting values is very slightly lower (better).

The function fitHMM also has the argument retryFits which perturbs the parameter estimates and retries fitting the model. Since the optimization of HMMs depends on the starting values chosen, it is always good to explore multiple initial values and then choose the best estimates among them (which is done by retryFits). The argument is used to indicate the number of times you want to perturb the parameters and retry fitting the model (you can choose the size of the perturbation by setting the argument retrySD).

Let’s try (this will take a few minutes).

# Fit the same 2-state HMM with retryFits
# This is a random perturbation, so we set the seed to get the same results
set.seed(29)
mod_RF <- fitHMM(data,
                 nbState = 2,
                 dist=list(step="gamma", angle="vm"),
                 Par0 = Par0,
                 retryFits = 20)
## Attempting to improve fit using random perturbation. Press 'esc' to force exit from 'fitHMM'
##     Attempt 1 of 20 -- current log-likelihood value: -1139.461
##     Attempt 2 of 20 -- current log-likelihood value: -1139.461
##     Attempt 3 of 20 -- current log-likelihood value: -1139.461
##     Attempt 4 of 20 -- current log-likelihood value: -1139.461
##     Attempt 5 of 20 -- current log-likelihood value: -1139.461
##     Attempt 6 of 20 -- current log-likelihood value: -1139.461
##     Attempt 7 of 20 -- current log-likelihood value: -1139.461
##     Attempt 8 of 20 -- current log-likelihood value: -1139.461
##     Attempt 9 of 20 -- current log-likelihood value: -1139.461
##     Attempt 10 of 20 -- current log-likelihood value: -1139.461
##     Attempt 11 of 20 -- current log-likelihood value: -1139.461
##     Attempt 12 of 20 -- current log-likelihood value: -1139.461
##     Attempt 13 of 20 -- current log-likelihood value: -1139.461
##     Attempt 14 of 20 -- current log-likelihood value: -1139.461
##     Attempt 15 of 20 -- current log-likelihood value: -1139.461
##     Attempt 16 of 20 -- current log-likelihood value: -1139.461
##     Attempt 17 of 20 -- current log-likelihood value: -1139.461
##     Attempt 18 of 20 -- current log-likelihood value: -1139.461
##     Attempt 19 of 20 -- current log-likelihood value: -1139.461
##     Attempt 20 of 20 -- current log-likelihood value: -1139.461

Let’s compare the results again.

# Negative log likelihood
c(original = mod$mod$minimum, new = mod2$mod$minimum,
  retryFits = mod_RF$mod$minimum)
##  original       new retryFits 
##  1139.461  1139.461  1139.461
mod_RF$mod$minimum - mod2$mod$minimum
## [1] -1.590047e-08

The maximum likelihood is again almost identical in every perturbation; the retryFits fit is marginally better (lower negative log-likelihood), but the difference is negligible. We will use the model from the second set of starting values moving forward.

momentuHMM has functions to help select initial parameters for complex models, which you can explore in the Extras section.

6.4 Pseudo-residuals

We fitted a model to the data and it looks like the parameter estimates are reliable, but is this a good model for our data? Can we use the model results? The best way to investigate model fit is through pseudo-residuals. Pseudo-residuals are a type of model residuals that account for the interdependence of observations. They are calculated for each of the time series (e.g., you will have pseudo-residuals for the step length time series and for the turning angle time series). If the model fit is appropriate, the pseudo-residuals produced by the function pseudoRes should follow a standard normal distribution. You can inspect pseudo-residuals directly via the function plotPR, which plots the pseudo-residual time series, the quantile-quantile (QQ) plots, and the autocorrelation functions (ACF).
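The construction behind pseudo-residuals can be sketched in a few lines: each observation is mapped through its one-step-ahead cumulative distribution function under the fitted model, and then through the standard normal quantile function. Below is a toy base-R illustration (pseudo_res_sketch is a hypothetical helper, not momentuHMM's pseudoRes code):

```r
# One-step-ahead pseudo-residuals: z_t = qnorm( Pr(X_t <= x_t | x_1, ..., x_{t-1}) ).
# delta: initial distribution; Gamma: transition matrix;
# dens(x, k) and cdf(x, k): state-dependent density and CDF for state k.
pseudo_res_sketch <- function(x, delta, Gamma, dens, cdf) {
  Tn <- length(x); K <- ncol(Gamma)
  z <- numeric(Tn)
  phi <- delta  # Pr(S_t = k | previous observations)
  for (t in 1:Tn) {
    u <- sum(phi * vapply(1:K, function(k) cdf(x[t], k), numeric(1)))
    z[t] <- qnorm(u)  # standard normal under a well-fitting model
    f <- phi * vapply(1:K, function(k) dens(x[t], k), numeric(1))
    phi <- as.numeric((f / sum(f)) %*% Gamma)  # predict next state distribution
  }
  z
}
```

As a sanity check, with a single state and standard normal observations the pseudo-residuals reduce to the observations themselves.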

plotPR(mod2)
## Computing pseudo-residuals...

The ACF plot indicates that the model does not fit the step length time series particularly well: there is considerable autocorrelation remaining, even after accounting for the persistence in the underlying behavioural states.

This could indicate that there are more hidden behavioural states. Let’s try a 3-state HMM.

# Setting up the starting values
mu3 <- c(0.100, 0.600, 1) # Mean step length
sigma3 <- mu3/2 # SD of the step length
kappa3 <- c(0.1, 1, 1) # Turn angle concentration parameter
# combine starting parameters 
Par0_3 <- list(step = c(mu3, sigma3), angle = kappa3)

# Fit the same 3 state HMM
mod_3state <- fitHMM(data, 
                     nbState = 3, 
                     dist = dist,
                     Par0 = Par0_3)

plot(mod_3state, plotTracks = FALSE, ask = FALSE)
## Decoding state sequence... DONE

We can look at the pseudo residuals for the 3 state model:

plotPR(mod_3state)
## Computing pseudo-residuals...

It’s better. Not perfect, but there is less unexplained autocorrelation, especially in the step lengths. If we look at the step length distributions, we can see that we still have a state with really small steps (maybe resting?), a state with medium-sized steps (maybe foraging?), and a state with large steps (maybe travelling?). As shown above, looking at the tracks can help us better interpret the nature of the states.

7 EXTRAS

7.1 Adding covariates to the model

We can add covariates to our HMM to better understand drivers of behaviour. We can either include covariates on the transition probabilities (i.e., the state process) or in the state-dependent distributions (i.e., the observation process). Respectively, these answer the questions:

  • Does a covariate have an effect on the probability of switching between states?

  • Does a covariate have an effect on the movement within each state?

7.1.1 Covariates on transition probabilities

It’s probably most common to include covariates on the transition probabilities in an HMM to assess how spatial, temporal, or demographic factors influence the state-switching dynamics. For example, prey density may increase the probability of switching into a foraging state or states may follow temporal patterns throughout the day. At time \(t\), each transition probability is linked to covariates using a multinomial logit link \[\gamma_{ij}^{(t)} = Pr(S_{t+1} = j \mid S_t = i) = \frac{\exp(\eta_{ij}^{(t)})}{\sum_{k=1}^{K}\exp(\eta_{ik}^{(t)})} \] with \(K\) states and the linear predictor for \(P\) covariates \(\{\omega_1^{(t)}, \omega_2^{(t)}, \dots \omega_P^{(t)}\}\) given as \[\eta_{ij}^{(t)} = \begin{cases} \beta^{(ij)}_0 + \sum_{p=1}^{P} \beta_p^{(ij)} \omega_p^{(t)} & \text{if } i \neq j \\ 0 & \text{otherwise.} \end{cases} \] For each transition probability, \(\beta_0^{(ij)}\) is an intercept parameter, and \(\beta_p^{(ij)}\) measures the effect of the \(p\)-th covariate \(\omega_p\).
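To make the link function concrete, here is a base-R sketch that rebuilds a 2-state transition probability matrix from its regression coefficients (tpm_from_beta is a hypothetical helper; the intercept values are taken from the 2-state model fitted earlier, and with \(K = 2\) the multinomial logit reduces to the ordinary logistic function plogis):

```r
# 2-state case: gamma_12 = plogis(eta_12) and gamma_21 = plogis(eta_21).
# beta: one row per predictor, one column per off-diagonal transition;
# covs: covariate vector (just an intercept of 1 here).
tpm_from_beta <- function(beta, covs = 1) {
  eta12 <- sum(beta[, "1 -> 2"] * covs)
  eta21 <- sum(beta[, "2 -> 1"] * covs)
  matrix(c(1 - plogis(eta12), plogis(eta12),
           plogis(eta21), 1 - plogis(eta21)),
         nrow = 2, byrow = TRUE)
}

# intercept-only coefficients estimated for the covariate-free model above
beta_hat <- matrix(c(-2.587876, -2.551128), nrow = 1,
                   dimnames = list("(Intercept)", c("1 -> 2", "2 -> 1")))
tpm_from_beta(beta_hat)  # recovers the estimated transition probability matrix
```

With covariates, covs would become c(1, bathy) and beta would gain a bathy row, so the matrix changes from step to step.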

Here, we will create a model where the transition probabilities vary as a function of the bathymetry (i.e., the depth of the ocean). Thus \(\omega_1\) is the bathymetry value associated with each step, and we can specify this with a formula like the one used in simple linear models.

Let’s import the bathymetry data.

bathy <- raster("data/bathy_4_narwhals.tiff")

The raw bathymetry data used to create this raster were taken from the GEBCO global ocean and land terrain model, see https://pressbooks.bccampus.ca/ewemodel/chapter/getting-bathymetry/. Note that land values of the bathymetry layer are 0s. Let’s plot the narwhal movement data over the bathymetry and land layers.

ggplot() +
  geom_spatraster(data=rast(bathy)) +
  geom_sf(data=land, fill="beige") +
  geom_spatial_path(data = tracks_gps, 
                    aes(x = x, y = y, colour=factor(ID)), crs = 4326) 

One covariate we will explore is time of day. Let’s derive it:

# calculate time of day variable (for later use)
tracks_gps$tod <- hour(tracks_gps$time) + minute(tracks_gps$time) / 60

# formula for transition probabilities as a function of bathymetry
formula_bathy <- ~bathy

The argument covNames distinguishes our model covariates from data streams (it is not needed if we are not using covariates). If there are missing covariate values, the default is to fill the missing value with the nearest value. This can be reasonable for spatial covariates, but wouldn’t be sensible for a covariate such as time of day. We have no missing covariate values, so it’s not a concern here, but this is something to be aware of when using prepData.

# calculate step length and turning angles
data <- prepData(tracks_gps, type = "LL", covNames = "tod", spatialCovs = list(bathy = bathy))

# take a peek at the data
head(data)
##          ID      step     angle                time         x        y
## 1 T172062-1 0.2489214        NA 2017-08-08 00:20:00 -80.65250 72.27280
## 2 T172062-1 0.2308335 0.7301706 2017-08-08 00:30:00 -80.65958 72.27336
## 3 T172062-1 0.4477989 0.8532449 2017-08-08 00:40:00 -80.66562 72.27242
## 4 T172062-1 0.6847686 0.1671223 2017-08-08 00:50:00 -80.66879 72.26852
## 5 T172062-1 0.5030554 0.2611253 2017-08-08 01:00:00 -80.67032 72.26240
## 6 T172062-1 0.2394635 1.0924722 2017-08-08 01:10:00 -80.66759 72.25797
##         tod     bathy
## 1 0.3333333 -492.6352
## 2 0.5000000 -433.3434
## 3 0.6666667 -384.0744
## 4 0.8333333 -347.2245
## 5 1.0000000 -271.0803
## 6 1.1666667 -316.8583


To be able to fit the HMM, we need to specify initial parameters, and we will use the function getPar0 to do so. This will use the estimated parameters of a previously fitted 2-state model with no covariates. Note that it will set the unknown covariate-effect parameters of the linear predictor to 0, and use the intercepts of the previous model.

# set tpm formula and extract initial values from simpler model
Par0_4 <- getPar0(model = mod2, formula = formula_bathy)
Par0_4
## $Par
## $Par$step
##    mean_1    mean_2      sd_1      sd_2 
## 0.2771543 0.7494254 0.1479317 0.2487531 
## 
## $Par$angle
## concentration_1 concentration_2 
##        2.259033       17.580478 
## 
## 
## $beta
##                1 -> 2    2 -> 1
## (Intercept) -2.587875 -2.551128
## bathy        0.000000  0.000000
## 
## $delta
## [1] 0.4602853 0.5397147

Then we can fit the model. The formula argument defines the model formula for the transition probabilities, and we need to set the correct initial parameters (i.e., Par0 for the observation parameters and beta0 for the transition probability model).

# fit model
mod_tpm <- fitHMM(data,
                  nbStates = 2,
                  stateNames = stateNames, 
                  dist = dist,
                  Par0 = Par0_4$Par,
                  beta0 = Par0_4$beta,
                  formula = formula_bathy)
mod_tpm$mle$beta
##                   1 -> 2        2 -> 1
## (Intercept) -1.759360948 -2.529798e+00
## bathy        0.001879288  9.764147e-05

There are several ways to visualise the relationship. If you use the standard plot() function on the fitted HMM, you can visualise the transition probabilities directly. Another option is to plot the stationary state probabilities, which represent how likely the animal is to be in each behaviour as a function of the bathymetry.

plotStationary(mod_tpm, plotCI = TRUE)

Here it looks like the “resident” behaviour is less common than the “travel” behaviour at shallower depths (e.g., bathy close to 0).
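For a fixed covariate value, the stationary distribution can also be computed by hand: it is the vector \(\pi\) satisfying \(\pi \Gamma = \pi\) with \(\sum_i \pi_i = 1\). Here is a base-R sketch using the transition probability matrix estimated for the covariate-free 2-state model above (stationary is a hypothetical helper):

```r
# solve pi %*% Gamma = pi subject to sum(pi) = 1, via the standard trick
# of solving pi %*% (I - Gamma + U) = 1, where U is a matrix of ones
stationary <- function(Gamma) {
  K <- nrow(Gamma)
  solve(t(diag(K) - Gamma + 1), rep(1, K))
}

# transition probability matrix from the covariate-free 2-state model
Gamma <- matrix(c(0.93007723, 0.06992277,
                  0.07235073, 0.92764927), nrow = 2, byrow = TRUE)
stationary(Gamma)  # long-run proportion of time in each state
```

plotStationary effectively repeats this computation over a grid of covariate values, with one transition matrix per value.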

7.1.2 Covariates in the observation model

This section will illustrate how to model mean step length as a function of a covariate. Any parameter in the observation model (e.g., mean or sd of the step lengths, or turning angle parameters) can be modelled as a function of a covariate, which allows for the movement behaviour within each discrete state to vary.

Here, we will model mean step length based on the time of day. Since time of day is a cyclic covariate, we must define a relationship that ensures that the mean step lengths are similar at the very end and very beginning of the day. We can do this by specifying the mean step length as a trigonometric (i.e., cyclic) function of time of day (defined as \(\tau\)),

\[\mu_{i}^{(t)} = \alpha^{(i)}_0 + \alpha_1^{(i)} \cos \left(\frac{2 \pi \tau_t}{24}\right) + \alpha_2^{(i)} \sin \left(\frac{2 \pi \tau_t}{24}\right) \] where 24 refers to the 24-hour period of interest. This relationship can conveniently be specified with the cosinor function.

# set new formula
formula_tod <- ~cosinor(tod, period = 24)
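The cosinor term simply expands into the cosine and sine columns of the equation above. A quick base-R sketch of those two columns (cosinor_cols is a hypothetical helper; momentuHMM builds this design matrix internally):

```r
# the two covariate columns implied by cosinor(tod, period = 24)
cosinor_cols <- function(tod, period = 24) {
  cbind(cos = cos(2 * pi * tod / period),
        sin = sin(2 * pi * tod / period))
}

# the design is periodic: tod = 0 and tod = 24 give identical rows
cosinor_cols(c(0, 6, 12, 24))
```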

To link the formula to the mean step length, we specify DM, which will be a list of formulas for the step length (and possibly other) parameters. For simplicity, we will only model the mean, but we could also model the standard deviation with the same relationship. We can derive the initial parameters using the getPar0 function, as described in the previous section.

# define step formula and design matrix
DM <- list(step = list(mean = formula_tod, sd = ~1))

# extract initial values from simpler 2-state model with no covariates
Par0_5 <- getPar0(model = mod2, DM = DM)

Now we can fit the HMM with covariate-dependent observation parameters.

# fit model
mod_obs <- fitHMM(data, 
                  nbStates = 2, 
                  stateNames = stateNames, 
                  dist = dist, 
                  Par0 = Par0_5$Par,
                  beta0 = Par0_5$beta, 
                  DM = DM)
mod_obs
## Value of the maximum log-likelihood: -1107.159 
## 
## 
## Regression coeffs for step parameters:
## --------------------------------------
##      mean_1:(Intercept) mean_1:cosinorCos(tod, period = 24)
## [1,]          -1.265903                         -0.02690091
##      mean_1:cosinorSin(tod, period = 24) mean_2:(Intercept)
## [1,]                          0.04151755         -0.3204209
##      mean_2:cosinorCos(tod, period = 24) mean_2:cosinorSin(tod, period = 24)
## [1,]                          -0.1354678                           0.1148526
##      sd_1:(Intercept) sd_2:(Intercept)
## [1,]        -1.874894        -1.441473
## 
## step parameters (based on mean covariate values):
## -------------------------------------------------
##       resident    travel
## mean 0.2951321 0.8660295
## sd   0.1533712 0.2365790
## 
## angle parameters:
## -----------------
##               resident   travel
## mean          0.000000  0.00000
## concentration 2.316892 17.96972
## 
## Regression coeffs for the transition probabilities:
## ---------------------------------------------------
##                1 -> 2    2 -> 1
## (Intercept) -2.664438 -2.540314
## 
## Transition probability matrix:
## ------------------------------
##            resident     travel
## resident 0.93489531 0.06510469
## travel   0.07307992 0.92692008
## 
## Initial distribution:
## ---------------------
##  resident    travel 
## 0.3364505 0.6635495

The regression coefficients are difficult to interpret for a cyclic function, but we can view the estimated relationship for each state with confidence intervals. Note that when the model has covariates, you can set plotCI = TRUE to add the 95% confidence intervals to the estimated values.

plot(mod_obs, ask = FALSE, plotCI = TRUE, plotTracks = FALSE)
## Decoding state sequence... DONE

This shows that the mean step length within each state varies with time of day, with higher step lengths in the morning. State 1 has a mean ranging from roughly 280-300 m (0.28-0.30 km) and state 2 from roughly 600-900 m (0.6-0.9 km).
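These ranges can be recovered from the regression coefficients reported above. Assuming the gamma step-length mean is modelled on the log scale (momentuHMM's default working scale; worth verifying for your version), a quick sketch for state 2 is:

```r
# regression coefficients for state 2 (travel), copied from the output above
b0 <- -0.3204209   # intercept
b1 <- -0.1354678   # cosinor cosine term
b2 <-  0.1148526   # cosinor sine term

tod <- seq(0, 24, by = 0.1)
# assuming a log link for the gamma mean
mu2 <- exp(b0 + b1 * cos(2 * pi * tod / 24) + b2 * sin(2 * pi * tod / 24))
round(range(mu2), 2)  # approx. 0.61 to 0.87 km
```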

As the observation data (i.e., the data streams: step lengths and turning angles) are the same between these models and the models above, we can use AIC to compare models with and without the time of day effect.

AIC(mod)  # without time of day
## [1] 2296.921
AIC(mod_tpm)  # bathymetry effect on transition probabilities
## [1] 2291.127
AIC(mod_obs)  # time of day effect on observation parameters
## [1] 2240.318

This indicates that the model with a time-varying mean step length is the best fitting model.

For these new models, we have not checked sensitivity to the starting values nor assessed model fit, as tutorial time is limited. However, these checks should be done for all fitted models, and special care should be taken when checking the final selected model.

7.2 Exercises

Here we will use polar bear GPS data of one adult female from: Derocher, Andrew, 2021, “Replication Data for: Togunov RR, Derocher AE, Lunn NJ, Auger-Méthé M. 2021. Characterising menotactic behaviours in movement data using hidden Markov models. Methods in Ecology and Evolution”, https://doi.org/10.7939/DVN/QYMAYV.

Here we only extracted the location data from this dataset. It is the data from one female polar bear from April 5, 2011 to May 5, 2011. The collar collected GPS data every 30 min, and the 8 missing locations were imputed using a continuous-time correlated random walk, as shown below for the narwhals (section 7.6.2). More information can be found in: Togunov RR, Derocher AE, Lunn NJ, Auger-Méthé M. 2021. Characterising menotactic behaviours in movement data using hidden Markov models. Methods in Ecology and Evolution 12: 1984-1998.

Let’s read the data and convert the time to a time format.

pb <- read.csv("data/pb_track.csv") %>%
  mutate(time = ymd_hm(time))
head(pb)
##                  time         x        y
## 1 2011-04-05 00:00:00 -84.77747 57.80716
## 2 2011-04-05 00:30:00 -84.79460 57.81190
## 3 2011-04-05 01:00:00 -84.78996 57.81931
## 4 2011-04-05 01:30:00 -84.76924 57.83419
## 5 2011-04-05 02:00:00 -84.77583 57.84343
## 6 2011-04-05 02:30:00 -84.75434 57.85447

The column time is the date and time at which the GPS location was obtained by the telemetry collar, and the coordinates are in degrees longitude and latitude (unprojected locations in WGS84, EPSG:4326). Answer the following questions.

Q1. Fit a simple two-state HMM.

Q2. Look at the state-dependent distribution and describe the movement characteristics associated with each behavioural states.

Q3. Identify what proportion of time is spent in each behaviour.

Q4. Plot the track with the decoded behavioural state.

Q5. Look at the pseudo residuals of the model. Do they indicate any issues with the model?

Q6. Assess whether including time of day as a covariate in the transition probabilities improve the model fit. Interpret the results.

Q7. Check if using time of day changes the pseudo-residuals.

Q8. What modification(s) could be done to improve the model?

The answers are found in section 7.7 below. But only peek if you are stuck; instead, look at the narwhal code for guidance.

7.3 Extra: Process covariates

Next, we need to derive our data streams (step lengths and turning angles) and covariates. One covariate we will explore is time of day, so we will derive it now.

# calculate time of day variable (for later use)
tracks_gps$tod <- hour(tracks_gps$time) + minute(tracks_gps$time) / 60

Now, we will get our observation variables using the prepData function.

The default coordinate type is UTM, but here the narwhal location data are in longitude and latitude, so we specify type = "LL".

The argument covNames will distinguish our model covariates from data streams. If there are missing covariate values, the default is to fill the missing value with the nearest value. This can be reasonable for spatial covariates, but wouldn’t be sensible for a covariate such as time of day. We have no missing covariate values, so it’s not a concern, but this is something to be aware of when using prepData.

As mentioned above, we will also explore using bathymetry as a covariate. Thus, we will extract the bathymetry values at the location using the argument spatialCovs.

# calculate step length and turning angles
data <- prepData(tracks_gps, type = "LL", covNames = "tod", spatialCovs = list(bathy = bathy))

# take a peek at the data
head(data)
##          ID      step     angle                time         x        y
## 1 T172062-1 0.2489214        NA 2017-08-08 00:20:00 -80.65250 72.27280
## 2 T172062-1 0.2308335 0.7301706 2017-08-08 00:30:00 -80.65958 72.27336
## 3 T172062-1 0.4477989 0.8532449 2017-08-08 00:40:00 -80.66562 72.27242
## 4 T172062-1 0.6847686 0.1671223 2017-08-08 00:50:00 -80.66879 72.26852
## 5 T172062-1 0.5030554 0.2611253 2017-08-08 01:00:00 -80.67032 72.26240
## 6 T172062-1 0.2394635 1.0924722 2017-08-08 01:10:00 -80.66759 72.25797
##         tod     bathy
## 1 0.3333333 -492.6352
## 2 0.5000000 -433.3434
## 3 0.6666667 -384.0744
## 4 0.8333333 -347.2245
## 5 1.0000000 -271.0803
## 6 1.1666667 -316.8583

We can see that prepData has calculated the step length and the turning angle, and has extracted the bathymetry value at each location.

7.4 Handling irregular data

First, let’s import the raw Fastloc GPS narwhal data and convert the time column to an appropriate date format.

tracks_gps_O <- read.csv("data/tracks_fastloc_gps.csv") %>%
  mutate(time = ymd_hms(time))

We can have a first look at the data, to get more familiar with it.

head(tracks_gps_O)
##        ID                time        x       y loc_class
## 1 T172062 2017-08-08 00:00:00 -80.6414 72.2694       GPS
## 2 T172062 2017-08-08 00:20:00 -80.6525 72.2728       GPS
## 3 T172062 2017-08-08 00:42:00 -80.6665 72.2719       GPS
## 4 T172062 2017-08-08 00:52:00 -80.6692 72.2674       GPS
## 5 T172062 2017-08-08 01:09:00 -80.6682 72.2582       GPS
## 6 T172062 2017-08-08 01:19:00 -80.6613 72.2574       GPS

The data we obtain are often quite messy, with some records missing information and others duplicated. We can filter records to keep only complete location data using !is.na(x) & !is.na(y). To remove duplicate records (same time, location, and location class), we will use the lag function from dplyr, which returns the value from the previous row so that we can compare it to the current row.

tracks_gps_O <- tracks_gps_O %>% 
  # remove missing locations
  filter(!is.na(x) & !is.na(y),
         # remove identical records
         !(time == lag(time) & x == lag(x) & y == lag(y) & loc_class == lag(loc_class)))

Let’s plot the data over the bathymetry and land layers we have imported at the beginning of the tutorial.

ggplot() +
  geom_spatraster(data=rast(bathy)) +
  geom_sf(data=land, fill="beige") +
  geom_spatial_path(data = tracks_gps_O, 
                    aes(x = x, y = y, colour=factor(ID)), crs = 4326) 

7.5 Selecting a time interval for the HMM

An HMM assumes the observations are collected in discrete time and that there are no missing data in the predictor variables. When the data are irregular, there are two key decisions we must make: (1) the temporal resolution to use, and (2) how to address data gaps. The desired resolution depends predominantly on the biological question you are asking, as different behaviours and biological processes occur at different spatial and temporal scales (e.g., seasonal migration, daily movement between foraging and resting grounds, and fine-scale foraging decisions). Generally, higher-resolution data are preferred as they contain more information. However, it is possible to have too high a resolution, wherein fine-scale variability drowns out the signal from coarse-scale patterns of interest (e.g., seasonal migration). A good rule of thumb is that you want 3-50 data points per behaviour. For behaviours spanning several hours, that roughly corresponds to a desired resolution between 2 min and 60 min.
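For example, for a behaviour lasting roughly 2.5 hours (150 min), the rule of thumb translates to:

```r
# behaviour lasting ~150 min, applying the 3-50 points-per-behaviour rule
150 / 50  # finest useful resolution: 3 min between locations
150 / 3   # coarsest useful resolution: 50 min between locations
```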

First, let’s calculate the time difference between successive records using difftime and lead (compares current row to following row) and place these values in a new column called dt. Note that the time difference is in minutes (units = "mins"). For the last record of each individual (i.e., when ID != lead(ID)), we will set the value to NA.

# Calculate time difference between locations
tracks_gps_O <- tracks_gps_O %>%
  mutate(dt = ifelse(ID == lead(ID), # If next data row is same individual
                     # calculate time difference
                     difftime(lead(time), time, units = "mins"), NA))

Let’s see what resolutions may be possible in the data by looking at the most frequent time gaps.

# Visualise time differences (all and zoomed)
par(mfrow = c(1, 2))
hist(tracks_gps_O$dt, 1000, main = NA, xlab = "Time difference (min)")
hist(tracks_gps_O$dt, 1000, main = NA, xlab = "Time difference (min)", xlim = c(0,100))

# identify the most frequent dt
tracks_gps_O %>% 
  {table(.$dt)} %>% 
  sort(decreasing = TRUE) %>% 
  head()
## 
##  10  11  12  22   9  13 
## 400 145  90  64  63  63

We see that the most frequent time gap is \(10\) min, followed by \(11\), \(12\), \(22\), \(9\) and \(13\) min. We also see that the majority of the gaps are \(< 60\) min, though some are in excess of \(600\) min. Because HMMs assume observations are taken at regular time intervals, finer resolutions will contain more data gaps that would need to be interpolated. Frequent and large data gaps can be difficult to handle, especially as the number of missing data points approaches or exceeds the existing data. Let’s examine the potential data structure at different resolutions for the different animals.

We can now use the function p_na (in the script utility_functions.R) to look at the proportion of NAs we would get with 10, 20, 30, and 60 min resolution.
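The p_na function lives in utility_functions.R; as a rough sketch of what such a function might compute (the actual implementation may differ), it can be thought of as the proportion of points on a regular time grid with no observation nearby:

```r
# hypothetical sketch of p_na; the real function in utility_functions.R
# may differ in details
p_na_sketch <- function(time, interval_min) {
  # regular grid from first to last observation, in steps of interval_min
  grid <- seq(min(time), max(time), by = paste(interval_min, "mins"))
  # minutes from each grid point to the nearest observation
  nearest <- vapply(as.numeric(grid), function(g) {
    min(abs(as.numeric(time) - g)) / 60
  }, numeric(1))
  # a grid point is "missing" if no observation falls within half an interval
  mean(nearest > interval_min / 2)
}
```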

# summarise track dt
tracks_gps_O %>% 
  group_by(ID) %>% 
  summarise(p_NA_10m = p_na(time, 10),     # 10 min 
            p_NA_20m = p_na(time, 20),     # 20 min 
            p_NA_30m = p_na(time, 30),     # 30 min 
            p_NA_60m = p_na(time, 60)) %>% # 60 min
  # return formatted table
  kable(digits = 3, col.names = c("Narwhal ID", paste(c(10,20,30,60), "m"))) %>%
  kable_styling() %>% 
  add_header_above(c("", "Resolution" = 4))
                        Resolution
Narwhal ID     10 m    20 m    30 m    60 m
T172062       0.486   0.236   0.191   0.152
T172064       0.502   0.203   0.120   0.074
T172066       0.488   0.230   0.185   0.160

Here we see that at the \(10\) min interval, around \(50\%\) of the locations would be missing. This is a lot, but choosing a finer resolution would lead to even more missing data and would likely bias the results.

For this tutorial, we will use a \(10\) min resolution, though in many cases we may want to be more conservative and use a 30 min resolution.

7.6 Handling data gaps

There are several ways to deal with data gaps:

  1. Split tracks
  2. Interpolate locations
  3. Fill the gaps with NAs
  4. Multiple imputation

To create the data used in this tutorial, we combined options 1 and 2. The basic steps are:

  1. Splitting tracks
  2. Interpolating locations within each track segment

7.6.1 Splitting tracks

One way to account for missing data is to split the track where there are large gaps (i.e., assign each track segment a new individual ID). This strategy is particularly appropriate when you have gaps long enough that interpolation methods are unlikely to perform well. We can split the tracks at gaps larger than a predetermined threshold by updating the ID column. Here, we will use a function (found in utility_functions.R) to split the tracks. We define the maximum allowable gap (at which point a new segment is started), as well as the shortest allowable track segment.

These are somewhat arbitrary decisions, and depend on your subsequent choices for regularisation. In this tutorial, we will be interpolating missing locations (within each segment) and so we only want to allow gaps that can reasonably be predicted. We’re using a 10-minute resolution, so we allow a 60 minute gap (i.e., we assume we can predict 6 missing locations), and we want each segment to be at least 20 locations long so that we have enough information about state transitions.

# Use function from utility_functions.R to split data at gaps > 60 min
data_split <- split_at_gap(data = tracks_gps_O, 
                           max_gap = 60, 
                           shortest_track = 20)
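split_at_gap is defined in utility_functions.R; a rough sketch of the logic it might implement (the actual function may differ, e.g., in how segments are numbered) is:

```r
# hypothetical sketch of the splitting logic in split_at_gap
split_at_gap_sketch <- function(data, max_gap = 60, shortest_track = 0) {
  # minutes since the previous location (NA for the first row)
  dt <- c(NA, as.numeric(diff(data$time), units = "mins"))
  # start a new segment at the first row, at each new animal, and after gaps
  new_seg <- is.na(dt) | data$ID != c("", head(data$ID, -1)) | dt > max_gap
  data$ID <- paste0(data$ID, "-", cumsum(new_seg))
  # keep only segments with at least shortest_track locations
  data[table(data$ID)[data$ID] >= shortest_track, ]
}
```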

An alternative to this approach is to fill large gaps with NAs, but this can be problematic if the transition probabilities depend on covariates. Further below, you can find some code that handles missing data by setting the gaps to NAs, using the package adehabitatLT.

7.6.2 Interpolation (correlated random walk)

Once the track is split, there is often still irregularity within each segment, and we want to interpolate or predict new locations to form a regular grid of observations.

The simplest approach is to use linear interpolation between missing times, but a better option is to predict locations from a continuous-time correlated random walk (CTCRW). momentuHMM contains wrapper functions to interpolate missing locations by fitting a CTCRW to the data based on the crawl package by Devin Johnson and Josh London. There are many options to fit the CTCRW model, and a detailed tutorial for analysis with crawl is available here: https://jmlondon.github.io/crawl-workshop/crawl-practical.html. Let’s try to fit the most basic model using the wrapper function crawlWrap. In the most basic form, we only need to provide tracking data with the columns ID, time, x, and y and specify the desired temporal resolution.

First, let us transform the data into an sf object. crawlWrap can also take a data.frame as an argument but that would imply renaming some of our columns. It is easier to just transform the data into an sf object.

data_sf <- data_split %>%
  st_as_sf(coords = c("x", "y")) %>% # converts to an sf object
  st_set_crs(4326) %>% # define CRS
  st_transform(2962) # reproject data to a UTM

Now we can fit the CTCRW to each track segment and create a dataframe of predicted locations.

# crawl can fail to fit periodically, so I recommend always setting a seed 
set.seed(12)

# fit crawl
crwOut <- crawlWrap(obsData = data_sf, timeStep = "10 mins", theta = c(7, 0))
## Fitting 22 track(s) using crawl::crwMLE...
## Individual T172062-1...
## Individual T172062-2...
## Individual T172062-3...
## Individual T172062-4...
## Individual T172062-5...
## Individual T172062-6...
## Individual T172062-7...
## Individual T172064-1...
## Individual T172064-2...
## Individual T172064-3...
## Individual T172064-4...
## Individual T172064-5...
## Individual T172064-6...
## Individual T172064-7...
## Individual T172064-8...
## Individual T172066-1...
## Individual T172066-2...
## Individual T172066-3...
## Individual T172066-4...
## Individual T172066-5...
## Individual T172066-6...
## Individual T172066-7...
## DONE
## 
## Predicting locations (and uncertainty) at 10 mins time steps for 22 track(s) using crawl::crwPredict... DONE
plot(crwOut, animals = "T172062-3", ask = FALSE)


# Get predicted tracks from crawl output
data <- crwOut$crwPredict[which(crwOut$crwPredict$locType == "p"),
                              c( "ID", "mu.x", "mu.y", "time")]
colnames(data) <- c( "ID","x", "y", "time")

Notice how the predicted tracks do not make perfectly straight lines through missing sections, which is an improvement on simple linear interpolation methods.

7.6.3 Pad gaps with NAs

We mentioned previously that another strategy to address data gaps is to leave the data streams (i.e., step length and turning angle) as NAs. The maximum size of gap to allow depends on the frequency of the missing data, the frequency of locations, the study species, and the behaviours of interest. Padding gaps with NAs can be used on its own or in conjunction with splitting tracks. The package adehabitatLT has a function setNA dedicated to this.

Let us first install and load the package.

We will apply this to the split tracks that we previously used in the tutorial. Here, instead of using crawl to interpolate missing locations, we simply create a dataframe with the missing locations set to NA (i.e., creating a regular grid of observations with some NAs).

# Create adehabitat object, containing the trajectory padded with NAs
data_ade <- setNA(ltraj = as.ltraj(xy = data_split[, c("x", "y")], 
                                   date = data_split$time, 
                                   id = data_split$ID), 
                  date.ref = data_split$time[1], 
                  dt = 10, tol = 5, units = "min")

# Transform back to dataframe
data_na <- ld(data_ade)[, c("id", "x", "y", "date")]
colnames(data_na) <- c("ID", "x", "y", "time")

7.6.4 Multiple Imputation

Multiple imputation works by: (1) fitting a CTCRW model to the original data; (2) drawing (i.e., simulating) a number of realisations of the position process based on the CTCRW model; (3) fitting HMMs to each of the simulated realisations; and (4) pooling the estimated parameters. momentuHMM has several functions to implement multiple imputation. The function MIfitHMM can be used both to simulate realisations of a fitted CTCRW and to fit HMMs to each one. The number of simulations is specified with nSims. We can simulate realisations without fitting HMMs by setting fit = FALSE.

Here, let’s first fit a CTCRW model to the complete tracks (not segmented), then use MIfitHMM to simulate 4 tracks and plot them over the original track.

set.seed(12)

# transform the tracks into an sf object
tracks_gps_sf_O <- tracks_gps_O %>%
  st_as_sf(coords = c("x", "y")) %>% # converts to an sf object
  st_set_crs(4326) %>% # define CRS
  st_transform(2962) # reproject data to a UTM

# Fit the correlated random walk; MIfitHMM takes a crwData object
crw_gps_10 <- crawlWrap(obsData = tracks_gps_sf_O, timeStep = "10 mins")
## Fitting 3 track(s) using crawl::crwMLE...
## crawl 2.3.0 (2022-10-06) 
##  Demos and documentation can be found at our new GitHub repository:
##  https://dsjohnson.github.io/crawl_examples/
##  
##  WARNING!!! v. 2.3.0 will be the last version of {crawl} hosted on CRAN.
##  see 'https://github.com/NMML/crawl' for any future bug fixes.
## Loading required package: foreach
## Loading required package: rngtools
## Individual T172062...
## Beginning SANN initialization ...
## Beginning likelihood optimization ...
## Individual T172064...
## Beginning SANN initialization ...
## Beginning likelihood optimization ...
## Individual T172066...
## Beginning SANN initialization ...
## Beginning likelihood optimization ...
## DONE
## 
## Predicting locations (and uncertainty) at 10 mins time steps for 3 track(s) using crawl::crwPredict... DONE
# simulate 4 realisations of the 10 min GPS CTCRW model
MI_sim_gps <- MIfitHMM(crw_gps_10, nSims = 4, fit = FALSE)
## Drawing 4 realizations from the position process using crawl...
## Running simulator for individual T172062...
## Running simulator for individual T172064...
## Running simulator for individual T172066...
## DONE
## Drawing imputation 1...
## Drawing imputation 2...
## Drawing imputation 3...
## Drawing imputation 4...
## DONE
# plot locations for first narwhal
# filter first ID from original data
track <- tracks_gps_sf_O %>% 
  mutate(x = st_coordinates(tracks_gps_sf_O)[,"X"], 
         y = st_coordinates(tracks_gps_sf_O)[,"Y"]) %>% 
  filter(ID == "T172062")
# filter first ID for each simulation
sim_tracks <- lapply(MI_sim_gps$miData, function(x){
  filter(x, ID == "T172062")})

# plot original track for first narwhal
plot(track$x, track$y, col = "red", xlab = "x", ylab = "y", asp = 1, type = "l")

# plot each simulated track
mute <- mapply(function(data, col) {
                 points(y ~ x, data = data, col = col, type = "l")
               }, data = sim_tracks, 
               col = list("cadetblue1", "cadetblue2", "cadetblue3", "cadetblue4"), 
               SIMPLIFY = FALSE)

Notice how in some areas the different simulations have generally good agreement in the likely location during gaps, while in others they diverge. Multiple imputation can be particularly powerful if we want to incorporate environmental variables, as spatially explicit variables can be extracted for each simulated track to sample the most likely conditions encountered by the animal.

7.7 Code for polar bear data exercise

7.7.1 Q1. Fit a simple two-state HMM.

Before we can do anything, we need to prepare the data. Here for a simple HMM without any covariate, we would only need to do the following.

pb_data <- prepData(pb, type = "LL")
head(pb_data)
##        ID      step      angle                time         x        y
## 1 Animal1 1.1473358         NA 2011-04-05 00:00:00 -84.77747 57.80716
## 2 Animal1 0.8703109 -1.4149433 2011-04-05 00:30:00 -84.79460 57.81190
## 3 Animal1 2.0646586 -0.3160667 2011-04-05 01:00:00 -84.78996 57.81931
## 4 Animal1 1.1007683  1.0023161 2011-04-05 01:30:00 -84.76924 57.83419
## 5 Animal1 1.7722516 -1.1673435 2011-04-05 02:00:00 -84.77583 57.84343
## 6 Animal1 1.4010926  0.2937965 2011-04-05 02:30:00 -84.75434 57.85447

To be able to find the starting values of our model, let’s visualize our data.

par(mfrow= c(1, 2))
hist(pb_data$step, breaks = 50, 
     main = "", xlab = "step length (km)")
hist(pb_data$angle, breaks=50, 
     main = "", xlab = "turning angle")

There is plenty of persistence in the turning angles (very high peak at 0) and, though hard to pick out exactly, there appear to be two peaks in step length, around 0.2 and 1.7 km.

par(mfrow= c(1, 1))
plot(pb_data$step ~ pb_data$time, ty = "l", ylab = "Step length",
     xlab = "Date", las = 1)
abline(h = 0.2, col = rgb(1, 0, 0, 0.5))
abline(h = 1.7, col = rgb(1, 0, 0, 0.5))

Let’s set up the arguments for the fitHMM function. Here we are using the same distributions as for the narwhals, but with starting values for the parameters informed by the data visualisation.

# Number of states
nbState_pb  <- 2
# Distribution for each data stream
dist_pb <- list(step = "gamma", angle = "vm") 
# Starting values
mu0_pb <- c(0.2, 1.7) # Mean step length 
sigma0_pb <- mu0_pb/2
kappa0_pb <- c(0.1, 2)
# Combined
par0_pb <- list(step = c(mu0_pb, sigma0_pb), angle=kappa0_pb)

Let’s fit the model with retryFits, so as to explore a few different starting values.

mod_pb_2state <- fitHMM(pb_data, 
                        nbState = nbState_pb,
                        dist = dist_pb,
                        Par0 = par0_pb,
                        retryFits = 10)
## =======================================================================
## Fitting HMM with 2 states and 2 data streams
## -----------------------------------------------------------------------
##  step ~ gamma(mean=~1, sd=~1)
##  angle ~ vm(concentration=~1)
## 
##  Transition probability matrix formula: ~1
## 
##  Initial distribution formula: ~1
## =======================================================================
## Attempting to improve fit using random perturbation. Press 'esc' to force exit from 'fitHMM'
##     Attempt 1 of 10 -- current log-likelihood value: -1967.279
##     ...
##     Attempt 10 of 10 -- current log-likelihood value: -1967.279
## 
## DONE

7.7.2 Q2. Look at the state-dependent distributions and describe the movement characteristics associated with each behavioural state.

plot(mod_pb_2state, ask = FALSE, plotTracks = FALSE)
## Decoding state sequence... DONE

State 1 is far more directed than state 2. The step length distribution shows a clear peak at small steps for state 1, while state 2 allows for a wide variation in step length.

7.7.3 Q3. Identify what proportion of time is spent in each behaviour.

To do this, we first need to apply the Viterbi algorithm to get the most likely state sequence.

dec_state_pb <- viterbi(mod_pb_2state)
head(dec_state_pb)
## [1] 2 2 2 2 2 2

Let’s count how many observations are in each state, and divide by the length of the time series.

table(dec_state_pb)/length(dec_state_pb)
## dec_state_pb
##         1         2 
## 0.3136711 0.6863289

The bear spends more time in state 2 than in state 1.

7.7.4 Q4. Plot the track with the decoded behavioural state

# add column for the viterbi sequence and state probabilities
pb_data$viterbi_state <- factor(dec_state_pb)

# Plot tracks, coloured by viterbi states
ggplot(pb_data, aes(x, y, col = viterbi_state)) +
  geom_point(size = 0.5) + 
  scale_color_manual(values = c("#E69F00", "#56B4E9"))

7.7.5 Q5. Look at the pseudo residuals of the model. Do they indicate any issues with the model?

plotPR(mod_pb_2state)
## Computing pseudo-residuals...

We see that there is unexplained autocorrelation in the step lengths. It’s worth looking at a model with more behavioural states or with additional covariates that may explain some of this correlation.

7.7.6 Q6. Assess whether including time of day as a covariate in the transition probabilities improve the model fit. Interpret the results.

To be able to use time of day in the model, we will need to prepare the data again.

pb$tod <- hour(pb$time) + minute(pb$time) /60
pb_data <- prepData(pb, type = "LL", covNames = "tod")
head(pb_data)
##        ID      step      angle                time         x        y tod
## 1 Animal1 1.1473358         NA 2011-04-05 00:00:00 -84.77747 57.80716 0.0
## 2 Animal1 0.8703109 -1.4149433 2011-04-05 00:30:00 -84.79460 57.81190 0.5
## 3 Animal1 2.0646586 -0.3160667 2011-04-05 01:00:00 -84.78996 57.81931 1.0
## 4 Animal1 1.1007683  1.0023161 2011-04-05 01:30:00 -84.76924 57.83419 1.5
## 5 Animal1 1.7722516 -1.1673435 2011-04-05 02:00:00 -84.77583 57.84343 2.0
## 6 Animal1 1.4010926  0.2937965 2011-04-05 02:30:00 -84.75434 57.85447 2.5

Now let’s create the formula for time of day.

formula_tod <- ~cosinor(tod, period = 24)

While this is the same formula as the one used for the narwhals, here we will use it in the transition probabilities, not in the state-dependent observation distributions.

We need to get the starting values. To be able to do so, we will need to refit the base model with the new data.

mod_pb_2state_new <- fitHMM(pb_data, 
                        nbState = nbState_pb,
                        dist = dist_pb,
                        Par0 = par0_pb)
## =======================================================================
## Fitting HMM with 2 states and 2 data streams
## -----------------------------------------------------------------------
##  step ~ gamma(mean=~1, sd=~1)
##  angle ~ vm(concentration=~1)
## 
##  Transition probability matrix formula: ~1
## 
##  Initial distribution formula: ~1
## =======================================================================
## DONE
par0_pb_tod <- getPar0(model = mod_pb_2state_new, 
                       formula = formula_tod)

Now let’s fit the model where the transition probabilities are allowed to change as a function of time of day.

mod_tpm_pb <- fitHMM(pb_data,
                  nbStates = nbState_pb,
                  dist = dist_pb,
                  Par0 = par0_pb_tod$Par,
                  beta0 = par0_pb_tod$beta,
                  formula = formula_tod)
## =======================================================================
## Fitting HMM with 2 states and 2 data streams
## -----------------------------------------------------------------------
##  step ~ gamma(mean=~1, sd=~1)
##  angle ~ vm(concentration=~1)
## 
##  Transition probability matrix formula: ~cosinor(tod, period = 24)
## 
##  Initial distribution formula: ~1
## =======================================================================
## DONE

Let’s plot the stationary state probabilities.

plotStationary(mod_tpm_pb, plotCI = TRUE)

It looks like state 2 is more common early in the day than state 1.

Is it better than the simple 2-state model?

AIC(mod_pb_2state, mod_tpm_pb)
##           Model      AIC
## 1    mod_tpm_pb 3941.880
## 2 mod_pb_2state 3952.557

Yes, it looks better.

7.7.7 Q7. Check if using time of day changes the pseudo-residuals.

plotPR(mod_tpm_pb)
## Computing pseudo-residuals...

There is still unexplained autocorrelation in the step length distribution.

7.7.8 Q8. What modification(s) could be done to improve the model?

Many modifications could be made! One example could be to add another state.

Here is the code to fit 3-state HMMs with and without time of day as a covariate for the transition probabilities. We have condensed the code a bit here. Note that we assume that the third state is intermediate between states 1 and 2.

# Model without tod
mod_pb_3state <- fitHMM(pb_data, 
                        nbState = 3,
                        dist = dist_pb,
                        Par0 = list(step = c(c(0.2, 1, 1.8), # mean
                                    c(0.1, 0.5, 0.9)), # sd
                                    angle = c(0.1, 1.0, 2.0))) # kappa
## =======================================================================
## Fitting HMM with 3 states and 2 data streams
## -----------------------------------------------------------------------
##  step ~ gamma(mean=~1, sd=~1)
##  angle ~ vm(concentration=~1)
## 
##  Transition probability matrix formula: ~1
## 
##  Initial distribution formula: ~1
## =======================================================================
## DONE
# Model with tod
par0_pb_tod_3 <- getPar0(model = mod_pb_3state, 
                       formula = formula_tod)
mod_tpm_pb_3state <- fitHMM(pb_data,
                  nbStates = 3,
                  dist = dist_pb,
                  Par0 = par0_pb_tod_3$Par,
                  beta0 = par0_pb_tod_3$beta,
                  formula = formula_tod)
## =======================================================================
## Fitting HMM with 3 states and 2 data streams
## -----------------------------------------------------------------------
##  step ~ gamma(mean=~1, sd=~1)
##  angle ~ vm(concentration=~1)
## 
##  Transition probability matrix formula: ~cosinor(tod, period = 24)
## 
##  Initial distribution formula: ~1
## =======================================================================
## DONE

Let’s look at AIC.

AIC(mod_tpm_pb_3state, mod_pb_3state)
##               Model      AIC
## 1 mod_tpm_pb_3state 3248.631
## 2     mod_pb_3state 3269.207

Note that AIC cannot be used to select the number of states. But if we compare the two models with 3 states, we see that the model with the time of day in transition probability matrix is better.

Let’s look at the pseudo residuals.

plotPR(mod_tpm_pb_3state)
## Computing pseudo-residuals...

The pseudo residuals look better (e.g., QQ plots) and the unexplained autocorrelation in steps is reduced but still problematic.

One of the things creating this autocorrelation is that when bears sleep or sit on the ice for long periods of time, they slowly drift with the sea ice, which creates highly correlated movement (we can see the long stretches of state 1). See Togunov et al. (2021) for an HMM that accounts for sea ice drift and models the olfactory search strategy of polar bears. Togunov et al. (2021) is accompanied by a tutorial that uses similar, albeit more complex, R code.

Togunov RR, Derocher AE, Lunn NJ, Auger-Méthé M. 2021. Characterising menotactic behaviours in movement data using hidden Markov models. Methods in Ecology and Evolution 12: 1984-1998.